How to generate synthetic data for your model using sub-models, with applications to economic scenario generation and portfolio composition.
Modern computers utilize pseudo-random number generators (PRNGs) to generate random-like numbers. PRNGs are algorithms used to generate sequences of numbers that appear to be random but are actually determined by an initial value, known as the seed. These generators are called “pseudo-random” because the sequences they produce are deterministic; if you provide the same seed, you’ll get the same sequence of numbers. In addition, they have a finite period, which means that after a certain number of generated values, the sequence will repeat. It’s important to choose or design PRNGs with a long enough period for practical applications.
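Both properties, determinism and a finite period, can be seen in a toy generator. The sketch below uses a linear congruential generator with deliberately tiny, hypothetical constants (`a = 5`, `c = 3`, `m = 16`) so that the cycle is short enough to inspect; it is an illustration only, not a generator anyone should use in practice.

```julia
# Toy linear congruential generator: x_{n+1} = (a * x_n + c) mod m.
# The constants are hypothetical and chosen only to make the period visible.
lcg_next(x; a=5, c=3, m=16) = mod(a * x + c, m)

function lcg_sequence(seed::Integer, n::Integer)
    xs = Vector{Int}(undef, n)
    x = seed
    for i in 1:n
        x = lcg_next(x)
        xs[i] = x
    end
    return xs
end

seq_a = lcg_sequence(7, 20)
seq_b = lcg_sequence(7, 20)

# Determinism: the same seed always produces the same sequence.
same_seed_same_sequence = seq_a == seq_b

# Finite period: all 16 states are visited once, then the cycle restarts.
period_is_16 = length(unique(seq_a[1:16])) == 16 &&
               seq_a[17:20] == seq_a[1:4]
```

Production generators follow the same pattern of "state in, state out", only with far larger state spaces and better mixing.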
Financial modelers should understand how PRNGs work because many financial models rely on Monte Carlo simulations, risk analysis, and other stochastic modeling techniques that require random sampling. A good PRNG is essential for robust financial modeling: a well-chosen generator gives reproducible, statistically unbiased results at low computational cost, which is critical when those results feed into financial decisions.
The Mersenne Twister is one of the most widely used PRNGs. One of its strengths is its exceptionally long period of \(2^{19937} - 1\), which means it can generate \(2^{19937} - 1\) pseudo-random numbers before repeating. This long period is crucial for applications requiring a large number of random numbers. It is also known for its good statistical properties: it passes many standard tests for randomness and provides a relatively uniform distribution of random numbers. Separate instances, each with its own state, can be used concurrently without interfering with each other, which makes it usable in parallel computing as long as the instances are seeded so that their streams do not overlap. Although there are faster generators for specific use cases, the Mersenne Twister is still often favored for its balance between speed and quality, and has been one of the recommended PRNGs for financial modeling purposes.
Xorshift is a family of PRNGs known for their simplicity and relatively fast operation. The name “xorshift” comes from the bitwise XOR (exclusive or) and bit-shifting operations that are the core of the algorithm. Xorshift generators are often used in applications where speed is a priority and cryptographic-strength randomness is not a strict requirement. Xorshift PRNGs use bitwise XOR, left shifts, and right shifts to update the internal state and generate pseudo-random numbers. The basic idea is to repeatedly apply these operations to the state to produce a sequence of numbers. The period of a typical xorshift generator is relatively short compared to some other PRNGs like the Mersenne Twister. However, there are variations of xorshift algorithms that can have longer periods. One of the main advantages of xorshift is its simplicity and speed. The bitwise XOR and bit-shifting operations can be efficiently implemented in hardware, making xorshift generators suitable for applications where fast random number generation is crucial.
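To make the update rule concrete, here is a minimal sketch of a 64-bit xorshift step. The shift triple (13, 7, 17) is one commonly cited choice for 64-bit state; the function names and the float conversion are illustrative assumptions rather than a reference implementation.

```julia
# One step of a 64-bit xorshift generator with the shift triple (13, 7, 17).
# The state must be nonzero: zero is a fixed point of the update.
function xorshift64(state::UInt64)
    state ⊻= state << 13
    state ⊻= state >> 7
    state ⊻= state << 17
    return state
end

# Draw n values by feeding each output back in as the next state.
function xorshift64_sequence(seed::UInt64, n::Integer)
    seed != 0 || throw(ArgumentError("seed must be nonzero"))
    out = Vector{UInt64}(undef, n)
    state = seed
    for i in 1:n
        state = xorshift64(state)
        out[i] = state
    end
    return out
end

draws = xorshift64_sequence(UInt64(1), 5)

# Map to floats in [0, 1) using the top 53 bits, a common convention.
uniforms = [(x >> 11) * 2.0^-53 for x in draws]
```

The whole generator is three shifts and three XORs per draw, which is where the speed comes from.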
Xoshiro is a family of PRNGs known for their high performance and good statistical properties. The name is a contraction of the operations the generators are built from: XOR, shift, and rotate. Xoshiro algorithms, including Xoshiro128 and others, combine bitwise XOR, bit shifts, bit rotations, and (in some variants) addition, so their update rules are more complex than those of basic xorshift algorithms. They also typically have longer periods, making them suitable for applications that require more pseudo-random numbers before repetition. Since version 1.7, Julia's default generator is a member of this family (Xoshiro256++).
Julia offers a consistent interface for random numbers due to its design and multiple dispatch principles. Consider the following random numbers in different data types.
using Random
rng = MersenneTwister(1234)
# Pass the generator explicitly so the draw comes from the seeded stream
rand(rng, Int, (2, 3)) # 2×3 Matrix{Int64}

using Random
rng = MersenneTwister(1234)
rand(rng, Float64, (2, 3)) # 2×3 Matrix{Float64} with values in [0, 1)

using Random
rng = Xoshiro(1234)
rand(rng, Bool, (2, 3)) # 2×3 Matrix{Bool}
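The consistency of the interface is easiest to see when the generator is passed as an argument. The sketch below (the helper name `sample_everything` is hypothetical) runs the same sampling calls against two different generator types; `Xoshiro` requires Julia 1.7 or later.

```julia
using Random

# The same calls work for any generator type; only the constructor changes.
function sample_everything(rng::AbstractRNG)
    return (
        ints   = rand(rng, Int, 2),      # raw machine integers
        floats = rand(rng, Float64, 2),  # uniform on [0, 1)
        dice   = rand(rng, 1:6, 3),      # draws from a collection
        normal = randn(rng, 2),          # standard normal draws
    )
end

mt_draws = sample_everything(MersenneTwister(1234))
xo_draws = sample_everything(Xoshiro(1234))

# Re-seeding reproduces the stream exactly.
reproducible = sample_everything(Xoshiro(1234)) == xo_draws
```

Writing model code against `AbstractRNG` rather than a concrete generator makes it straightforward to swap generators or to inject a seeded one in tests.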
Scenario generators are widely used in risk management, investment analysis, and regulatory compliance to model potential future outcomes. If the goal is forecasting actual market behavior, real-world (RW) scenarios are commonly used. If, on the other hand, the goal is pricing financial instruments, risk-neutral (RN) scenarios are often used.
RW scenario generators are used to simulate market movements to estimate potential portfolio losses. Basel III regulatory capital requirements have adopted these approaches.
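As a minimal sketch of loss estimation from simulated scenarios, the example below draws one-day returns and reads off a 99% Value at Risk (VaR) and expected shortfall. The portfolio value, the return parameters, and the normal-returns assumption are all hypothetical and chosen only for illustration.

```julia
using Random, Statistics

rng = Xoshiro(2024)

# Hypothetical inputs: portfolio value and daily return parameters.
portfolio_value = 1_000_000.0
μ_daily = 0.0003
σ_daily = 0.012
num_scenarios = 100_000

# One-day real-world scenarios under a normal-returns assumption.
returns = μ_daily .+ σ_daily .* randn(rng, num_scenarios)
pnl = portfolio_value .* returns

# 99% VaR: the loss exceeded in only 1% of scenarios, reported as a positive number.
var_99 = -quantile(pnl, 0.01)

# Expected shortfall: the average loss in the scenarios at or beyond the VaR level.
es_99 = -mean(pnl[pnl .<= -var_99])
```

In practice the return scenarios would come from a calibrated model rather than a single normal distribution, but the quantile step stays the same.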
RW scenario generators can also be used to generate extreme but plausible market conditions to assess resilience, which is required by central banks and financial regulators (e.g., Federal Reserve and ECB).
RW scenario generators are used to simulate thousands of market conditions to determine optimal portfolio allocations, an approach commonly used with modern portfolio theory (MPT) and Black-Litterman models.
RW scenario generators can be used to simulate longevity risk, policyholder behavior, and interest rate movements. They are also used for economic capital estimation under uncertain economic scenarios.
Central banks and institutions (e.g., IMF, World Bank) use RW scenario generators to predict macroeconomic trends.
RN scenario generators help value options using stochastic models (e.g., Black-Scholes, Heston model). They can help simulate future stock price movements under different volatility conditions. They can also be used for hedging purposes to test how a portfolio performs under different inflation, interest rate, or commodity price scenarios.
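A minimal sketch of risk-neutral valuation is shown below: a European call is priced by simulating terminal stock prices under the Black-Scholes dynamics with the risk-free rate as drift, then discounting the average payoff. The function name and the contract parameters are hypothetical.

```julia
using Random, Statistics

# Monte Carlo price of a European call under risk-neutral GBM.
function mc_european_call(rng, S₀, K, r, σ, T, num_paths)
    Z = randn(rng, num_paths)
    # Terminal price: the drift is the risk-free rate r, not the expected return.
    S_T = S₀ .* exp.((r - σ^2 / 2) * T .+ σ * sqrt(T) .* Z)
    payoff = max.(S_T .- K, 0.0)
    # Discount the average payoff back to today.
    return exp(-r * T) * mean(payoff)
end

price = mc_european_call(Xoshiro(1234), 100.0, 100.0, 0.05, 0.2, 1.0, 200_000)
```

For these inputs the Black-Scholes formula gives a price of about 10.45, so the Monte Carlo estimate should land close to that value, up to sampling error.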
Yield curve modeling uses RN scenarios to value bonds and interest rate derivatives. Swaps, swaptions, and credit default swaps (CDS) also rely on RN pricing. RN scenario generators can also simulate yield curves for bond and fixed-income pricing. Models like Cox-Ingersoll-Ross (CIR) or Hull-White generate future interest rate paths.
Fair value accounting under IFRS 13 uses RN models to determine the market-consistent value of liabilities. Solvency II requires insurers to value policyholder guarantees using RN scenarios.
Economic scenario generation involves the development of plausible future economic scenarios to assess the potential impact on financial portfolios, investments, or decision-making processes. Various approaches are used to generate economic scenarios, including stochastic differential equations (SDEs) and Monte Carlo simulations.
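The Vasicek model is a one-factor model commonly used for simulating interest rate scenarios. It describes the dynamics of the short-term interest rate using a stochastic differential equation (SDE), and in a Monte Carlo simulation it can be used to generate multiple interest rate paths. The Vasicek model is defined as
\[ dr(t) = \kappa (\theta - r(t)) \, dt + \sigma \, dW(t) \]
where \(r(t)\) is the short-term interest rate, \(\kappa\) is the speed of mean reversion, \(\theta\) is the long-term mean, \(\sigma\) is the volatility, and \(W(t)\) is a standard Brownian motion.
The Cox-Ingersoll-Ross (CIR) model extends the Vasicek model with state-dependent volatility: the diffusion term is scaled by \(\sqrt{r(t)}\), so the volatility shrinks as the rate approaches zero. This addresses the issue of negative interest rates in the Vasicek model by keeping the continuous-time process non-negative. The CIR model is defined as
\[ dr(t) = \kappa (\theta - r(t)) \, dt + \sigma \sqrt{r(t)} \, dW(t) \]
where the symbols have the same meaning as in the Vasicek model.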
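As a sketch of how such an SDE is turned into a simulation, the Euler-Maruyama scheme replaces \(dt\) with a small step \(\Delta t\) and \(dW(t)\) with \(\sqrt{\Delta t} \, Z\), where \(Z\) is a standard normal draw. For CIR this gives
\[ r_{t+\Delta t} = r_t + \kappa (\theta - r_t) \, \Delta t + \sigma \sqrt{r_t} \, \sqrt{\Delta t} \, Z, \qquad Z \sim N(0, 1) \]
The discretized rate can overshoot below zero even though the continuous-time process cannot, so implementations commonly floor it at zero (one of several possible fixes). The continuous-time process stays strictly positive when the Feller condition holds:
\[ 2 \kappa \theta \geq \sigma^2 \]
For example, with \(\kappa = 0.2\), \(\theta = 0.05\), and \(\sigma = 0.1\), we get \(2 \kappa \theta = 0.02 \geq 0.01 = \sigma^2\), so the condition is satisfied.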
The following code shows a simplified implementation of a CIR model. The specification of \(dr\) can be changed to make it a Vasicek model.
using Random, CairoMakie
# Set seed for reproducibility
Random.seed!(1234)
# CIR model parameters
κ = 0.2 # Speed of mean reversion
θ = 0.05 # Long-term mean
σ = 0.1 # Volatility
# Initial short-term interest rate
r₀ = 0.03
# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000
# Time increment
Δt = 1 / 252
# Function to simulate CIR process
function cir_simulation(κ, θ, σ, r₀, Δt, num_steps, num_simulations)
    interest_rate_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        interest_rate_paths[1, j] = r₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            # for Vasicek
            # dr = κ * (θ - interest_rate_paths[i-1, j]) * Δt + σ * dW
            dr = κ * (θ - interest_rate_paths[i-1, j]) * Δt + σ * sqrt(interest_rate_paths[i-1, j]) * dW
            interest_rate_paths[i, j] = max(interest_rate_paths[i-1, j] + dr, 0) # Ensure non-negativity
        end
    end
    return interest_rate_paths
end
# Run CIR simulation
cir_paths = cir_simulation(κ, θ, σ, r₀, Δt, num_steps, num_simulations)
# Plot the simulated interest rate paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, cir_paths[:, i])
end
f
The Hull-White model is a one-factor model that extends the Vasicek model by allowing the drift and volatility parameters to be time-dependent, which lets it be calibrated to the current yield curve. It is commonly used for pricing interest rate derivatives. The Brace-Gatarek-Musiela (BGM) model, also known as the LIBOR Market Model (LMM), goes further by describing the evolution of a set of forward rates rather than a single short rate, so it models the entire yield curve with multiple factors. The Hull-White model is defined as
\[ dr(t) = (\theta(t) - a r(t)) \, dt + \sigma(t) \, dW(t) \]
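where \(\theta(t)\) is a time-dependent drift term, typically chosen to fit the initial term structure, \(a\) is the speed of mean reversion (written α in the code below), \(\sigma(t)\) is the volatility, and \(W(t)\) is a standard Brownian motion.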
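For intuition, consider the special case where \(\theta\) and \(\sigma\) are constant. Factoring out \(a\) shows that the model then has the same form as Vasicek, with long-run mean \(\theta / a\):
\[ dr(t) = a \left( \frac{\theta}{a} - r(t) \right) dt + \sigma \, dW(t) \]
The expected short rate then relaxes exponentially from \(r(0)\) toward that level:
\[ \mathbb{E}[r(t)] = r(0) \, e^{-a t} + \frac{\theta}{a} \left( 1 - e^{-a t} \right) \]
With hypothetical values \(a = 0.1\) and \(\theta = 0.005\), the long-run mean is \(0.05\).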
using Random, CairoMakie
# Set seed for reproducibility
Random.seed!(1234)
# Hull-White model parameters
α = 0.1 # Mean reversion speed
σ = 0.02 # Volatility (held constant here)
θ = 0.005 # Drift term θ(t), held constant as a simplifying assumption (long-run mean θ/α = 0.05)
r₀ = 0.03 # Initial short-term interest rate
# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000
# Time increment
Δt = 1 / 252
# Function to simulate Hull-White process
function hull_white_simulation(α, θ, σ, r₀, Δt, num_steps, num_simulations)
    interest_rate_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        interest_rate_paths[1, j] = r₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            # Drift (θ - α * r) dt, as in the SDE
            dr = (θ - α * interest_rate_paths[i-1, j]) * Δt + σ * dW
            interest_rate_paths[i, j] = interest_rate_paths[i-1, j] + dr
        end
    end
    return interest_rate_paths
end
# Run Hull-White simulation
hull_white_paths = hull_white_simulation(α, θ, σ, r₀, Δt, num_steps, num_simulations)
# Plot the simulated interest rate paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, hull_white_paths[:, i])
end
f
Geometric Brownian motion (GBM) is a stochastic process commonly used to model the price movement of financial instruments, including stocks. It assumes constant drift and volatility, and the resulting prices follow a log-normal distribution. It is defined as
\[ dS(t) = \mu S(t) \, dt + \sigma S(t) \, dW(t) \]
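where \(S(t)\) is the price at time \(t\), \(\mu\) is the drift (expected return), \(\sigma\) is the volatility, and \(W(t)\) is a standard Brownian motion.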
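GBM also has a closed-form solution, \(S(t + \Delta t) = S(t) \exp\left( (\mu - \sigma^2 / 2) \Delta t + \sigma \sqrt{\Delta t} \, Z \right)\) with \(Z \sim N(0, 1)\), so each step can be sampled exactly instead of approximated. The sketch below (the function name is hypothetical) simulates terminal prices this way and compares their sample mean with the known expectation \(S_0 e^{\mu T}\).

```julia
using Random, Statistics

# Exact GBM step: S(t+Δt) = S(t) * exp((μ - σ²/2)Δt + σ√Δt Z), Z ~ N(0, 1).
function gbm_exact_terminal(rng, μ, σ, S₀, Δt, num_steps, num_simulations)
    S = fill(float(S₀), num_simulations)
    for _ in 2:num_steps
        Z = randn(rng, num_simulations)
        S .*= exp.((μ - σ^2 / 2) * Δt .+ σ * sqrt(Δt) .* Z)
    end
    return S
end

# 253 grid points give 252 steps of size 1/252, i.e. a horizon of T = 1 year.
S_T = gbm_exact_terminal(Xoshiro(1234), 0.05, 0.2, 100, 1 / 252, 253, 100_000)

# Sanity check against the known mean E[S(T)] = S₀ * exp(μT).
sample_mean = mean(S_T)
theoretical_mean = 100 * exp(0.05)
```

Unlike an Euler step, this update can never produce a negative price, because each step multiplies the current price by a positive factor.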
using Random, CairoMakie
# Set seed for reproducibility
Random.seed!(1234)
# GBM parameters
μ = 0.05 # Drift (expected return)
σ = 0.2 # Volatility
# Initial stock price
S₀ = 100
# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000
# Time increment
Δt = 1 / 252
# Function to simulate GBM
function gbm_simulation(μ, σ, S₀, Δt, num_steps, num_simulations)
    stock_price_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        stock_price_paths[1, j] = S₀
        for i in 2:num_steps
            dW = randn() * sqrt(Δt)
            S = stock_price_paths[i-1, j]
            dS = μ * S * Δt + σ * S * dW
            stock_price_paths[i, j] = S + dS
        end
    end
    return stock_price_paths
end
# Run GBM simulation
gbm_paths = gbm_simulation(μ, σ, S₀, Δt, num_steps, num_simulations)
# Plot the simulated stock price paths
f = Figure()
Axis(f[1, 1])
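for i in 1:num_simulations
    lines!(1:num_steps, gbm_paths[:, i])
end
f
Generalized autoregressive conditional heteroskedasticity (GARCH) models capture time-varying volatility. They are often used in conjunction with other models to forecast volatility. A GARCH(1,1) model for the return \(r_t\), with conditional variance \(\sigma^2_t\) and independent standard normal innovations \(\varepsilon_t\), is defined as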
\[ \sigma^2_t = \omega + \alpha_1 r^2_{t-1} + \beta_1 \sigma^2_{t-1} \]
\[ r_t = \varepsilon_t \sqrt{\sigma^2_t} \]
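One useful consequence of these equations is the long-run (unconditional) variance. Taking expectations on both sides of the variance equation and solving gives \(\bar{\sigma}^2 = \omega / (1 - \alpha_1 - \beta_1)\), which exists only when \(\alpha_1 + \beta_1 < 1\). The helper below is a small sketch of that calculation (the function name is hypothetical), evaluated with a constant term of 0.01 and coefficients 0.1 and 0.8.

```julia
# Long-run variance of a GARCH(1,1) process; only defined when α₁ + β₁ < 1.
function garch_unconditional_variance(ω, α₁, β₁)
    persistence = α₁ + β₁
    persistence < 1 || throw(DomainError(persistence, "requires α₁ + β₁ < 1"))
    return ω / (1 - persistence)
end

long_run_variance = garch_unconditional_variance(0.01, 0.1, 0.8)
long_run_volatility = sqrt(long_run_variance)
```

This value is a natural starting point for the conditional variance in a simulation, since simulated volatility paths fluctuate around it.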
using Random, CairoMakie
# Set seed for reproducibility
Random.seed!(1234)
# GARCH(1,1) parameters
ω = 0.01 # Constant term
α₁ = 0.1 # Coefficient for lagged squared returns
β₁ = 0.8 # Coefficient for lagged conditional variance
# Number of time steps and simulations
num_steps = 252
num_simulations = 1_000
# Function to simulate GARCH(1,1) volatility
function garch_simulation(ω, α₁, β₁, num_steps, num_simulations)
    volatility_paths = zeros(num_steps, num_simulations)
    for j in 1:num_simulations
        ε = randn(num_steps)
        conditional_variance = zeros(num_steps)
        returns = zeros(num_steps)
        # Start from the unconditional variance ω / (1 - α₁ - β₁)
        conditional_variance[1] = ω / (1 - α₁ - β₁)
        returns[1] = ε[1] * sqrt(conditional_variance[1])
        volatility_paths[1, j] = sqrt(conditional_variance[1])
        for i in 2:num_steps
            # σ²ₜ = ω + α₁ r²ₜ₋₁ + β₁ σ²ₜ₋₁, then rₜ = εₜ σₜ
            conditional_variance[i] = ω + α₁ * returns[i-1]^2 + β₁ * conditional_variance[i-1]
            returns[i] = ε[i] * sqrt(conditional_variance[i])
            volatility_paths[i, j] = sqrt(conditional_variance[i])
        end
    end
    return volatility_paths
end
# Run GARCH simulation
garch_paths = garch_simulation(ω, α₁, β₁, num_steps, num_simulations)
# Plot the simulated volatility paths
f = Figure()
Axis(f[1, 1])
for i in 1:num_simulations
    lines!(1:num_steps, garch_paths[:, i])
end
f
Simulating data using copulas involves generating multivariate samples with specified marginal distributions and a dependence structure described by the copula.
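The mechanics can be sketched by hand for a Gaussian copula, assuming the Distributions package is available: draw correlated normals, push them through the normal CDF to get dependent uniforms, and then apply the inverse CDF of each desired marginal. The correlation of 0.8 and the Gamma and LogNormal marginals are illustrative choices.

```julia
using Random, LinearAlgebra, Statistics, Distributions

rng = Xoshiro(1234)
ρ = 0.8
n = 10_000

# Step 1: correlated standard normals via the Cholesky factor of the correlation matrix.
Σ = [1.0 ρ; ρ 1.0]
L = cholesky(Σ).L
Z = L * randn(rng, 2, n)   # 2 × n, each column is one correlated pair

# Step 2: map each coordinate to (0, 1) with the normal CDF; this is the copula sample.
U = cdf.(Normal(), Z)

# Step 3: apply the inverse CDFs of the chosen marginals (hypothetical choices).
x₁ = quantile.(Gamma(2, 3), U[1, :])
x₂ = quantile.(LogNormal(0, 1), U[2, :])
```

The marginals of `x₁` and `x₂` are exactly the chosen distributions, while their dependence is inherited entirely from the copula step.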
using Random, CairoMakie, BivariateCopulas
# Set seed for reproducibility
Random.seed!(1234)
# Generate a Gaussian copula
gaussian_copula = Gaussian(0.8)
# Show simulated copula
f = scatter(rand(gaussian_copula, 10^4))
f
Copulas can also be used to fit joint distributions to data samples.
using Copulas, Distributions, Random
X₁ = Gamma(2, 3)
X₂ = Pareto()
X₃ = LogNormal(0, 1)
C = ClaytonCopula(3, 0.7) # A 3-variate Clayton Copula with θ = 0.7
D = SklarDist(C, (X₁, X₂, X₃)) # The final distribution
# Generate a dataset
simu = rand(D, 1000)
# We may estimate a copula, or get parameters of underlying distributions, using the `fit` function.
# Note: the second marginal is fitted as Normal here, although the data were simulated from Pareto.
D̂ = fit(SklarDist{ClaytonCopula,Tuple{Gamma,Normal,LogNormal}}, simu)
SklarDist{ClaytonCopula{3, Float64}, Tuple{Gamma{Float64}, Normal{Float64}, LogNormal{Float64}}}(
  C: ClaytonCopula{3, Float64}(
    G: Copulas.ClaytonGenerator{Float64}(0.7255762179151387)
  )
  m: (Gamma{Float64}(α=1.9509359315325794, θ=3.0668504198367565), Normal{Float64}(μ=6.958764796293847, σ=27.415016590130424), LogNormal{Float64}(μ=0.01132053842187167, σ=1.0263584835287456))
)